Search | BVS Regional Portal

1.

GenePainter: a fast tool for aligning gene structures of eukaryotic protein families, visualizing the alignments and mapping gene structures onto protein structures.

Hammesfahr, Björn; Odronitz, Florian; Mühlhausen, Stefanie; Waack, Stephan; Kollmar, Martin.

BMC Bioinformatics ; 14: 77, 2013 Mar 04.

Article in English | MEDLINE | ID: mdl-23496949

ABSTRACT

BACKGROUND: All sequenced eukaryotic genomes have been shown to possess at least a few introns. This includes those unicellular organisms, which were previously suspected to be intron-less. Therefore, gene splicing must have been present at least in the last common ancestor of the eukaryotes. To explain the evolution of introns, basically two mutually exclusive concepts have been developed. The introns-early hypothesis says that already the very first protein-coding genes contained introns while the introns-late concept asserts that eukaryotic genes gained introns only after the emergence of the eukaryotic lineage. A very important aspect in this respect is the conservation of intron positions within homologous genes of different taxa. RESULTS: GenePainter is a standalone application for mapping gene structure information onto protein multiple sequence alignments. Based on the multiple sequence alignments the gene structures are aligned down to single nucleotides. GenePainter accounts for variable lengths in exons and introns, respects split codons at intron junctions and is able to handle sequencing and assembly errors, which are possible reasons for frame-shifts in exons and gaps in genome assemblies. Thus, even gene structures of considerably divergent proteins can properly be compared, as it is needed in phylogenetic analyses. Conserved intron positions can also be mapped to user-provided protein structures. For their visualization GenePainter provides scripts for the molecular graphics system PyMol. CONCLUSIONS: GenePainter is a tool to analyse gene structure conservation providing various visualization options. A stable version of GenePainter for all operating systems as well as documentation and example data are available at http://www.motorprotein.de/genepainter.html.
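The idea of mapping gene structures onto a protein multiple sequence alignment can be illustrated with a minimal sketch. It is not GenePainter's code: the function name, the input layout (intron positions as 0-based residue indices with a phase of 0, 1 or 2 for split codons) and the toy sequences are all assumptions made for illustration.

```python
def conserved_introns(alignment, introns, gap="-"):
    """Find intron positions shared by at least two aligned sequences.

    alignment: dict of name -> aligned protein sequence (with gap characters).
    introns:   dict of name -> list of (residue_index, phase) pairs, where
               residue_index is 0-based in the ungapped protein and phase
               (0, 1 or 2) says where in the codon the intron falls.
    Both layouts are hypothetical, chosen only for this sketch.
    """
    shared = {}
    for name, aligned in alignment.items():
        # Column of every ungapped residue, in order.
        columns = [col for col, ch in enumerate(aligned) if ch != gap]
        for residue, phase in introns.get(name, []):
            shared.setdefault((columns[residue], phase), set()).add(name)
    # Keep only (column, phase) positions seen in more than one sequence.
    return {key: names for key, names in shared.items() if len(names) > 1}


# Toy alignment: the two sequences differ by one insertion each.
alignment = {"seqA": "MK-TLLV", "seqB": "MKQT-LV"}
introns = {"seqA": [(2, 0), (4, 1)], "seqB": [(3, 0), (4, 2)]}
print(conserved_introns(alignment, introns))
```

Comparing on the pair (alignment column, phase) rather than on residue numbers is what lets introns line up across sequences of different length; introns in the same column but with different phases are treated as distinct here.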

Subjects

Exons , Introns , Proteins/genetics , Sequence Alignment/methods , Software , Chromosome Mapping/methods , Computer Graphics , Eukaryota/genetics , Molecular Evolution , Molecular Models , Protein Sequence Analysis

2.

Peakr: simulating solid-state NMR spectra of proteins.

Schneider, Robert; Odronitz, Florian; Hammesfahr, Björn; Hellkamp, Marcel; Kollmar, Martin.

Bioinformatics ; 29(9): 1134-40, 2013 May 01.

Article in English | MEDLINE | ID: mdl-23493322

ABSTRACT

MOTIVATION: When analyzing solid-state nuclear magnetic resonance (NMR) spectra of proteins, assignment of resonances to nuclei and derivation of restraints for 3D structure calculations are challenging and time-consuming processes. Simulated spectra that have been calculated based on, for example, chemical shift predictions and structural models can be of considerable help. Existing solutions are typically limited in the type of experiment they can consider and difficult to adapt to different settings. RESULTS: Here, we present Peakr, a software to simulate solid-state NMR spectra of proteins. It can generate simulated spectra based on numerous common types of internuclear correlations relevant for assignment and structure elucidation, can compare simulated and experimental spectra and produces lists and visualizations useful for analyzing measured spectra. Compared with other solutions, it is fast, versatile and user friendly. AVAILABILITY AND IMPLEMENTATION: Peakr is maintained under the GPL license and can be accessed at http://www.peakr.org. The source code can be obtained on request from the authors.
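As a rough illustration of simulating a spectrum from chemical shifts, the sketch below builds a 2D peak list for intra-residue correlations. It does not reproduce Peakr: the function name, the input layout and the shift values are assumptions chosen only for the example.

```python
from itertools import combinations


def intraresidue_peaks(shifts):
    """Build a symmetric 2D cross-peak list from per-residue chemical shifts.

    shifts: dict of residue_number -> dict of atom_name -> shift in ppm
            (a hypothetical layout, not Peakr's input format).
    Returns tuples (residue, atom_1, atom_2, ppm_1, ppm_2).
    """
    peaks = []
    for residue, atoms in sorted(shifts.items()):
        for (a1, ppm1), (a2, ppm2) in combinations(sorted(atoms.items()), 2):
            peaks.append((residue, a1, a2, ppm1, ppm2))
            # Mirror peak on the other side of the diagonal.
            peaks.append((residue, a2, a1, ppm2, ppm1))
    return peaks


# Illustrative values only, not measured or predicted data.
example_shifts = {1: {"CA": 53.0, "CB": 19.0}}
for peak in intraresidue_peaks(example_shifts):
    print(peak)
```

Other correlation types (sequential or through-space contacts, for instance) would need residue connectivity or a structural model as extra input, which this sketch leaves out.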

Subjects

Biomolecular Nuclear Magnetic Resonance/methods , Proteins/chemistry , Software , Protein Conformation

3.

diArk 2.0 provides detailed analyses of the ever increasing eukaryotic genome sequencing data.

Hammesfahr, Björn; Odronitz, Florian; Hellkamp, Marcel; Kollmar, Martin.

BMC Res Notes ; 4: 338, 2011 Sep 09.

Article in English | MEDLINE | ID: mdl-21906294

ABSTRACT

BACKGROUND: Nowadays, the sequencing of even the largest mammalian genomes has become a question of days with current next-generation sequencing methods. It comes as no surprise that dozens of genome assemblies are now released per month. Since the number of next-generation sequencing machines increases worldwide and new major sequencing plans are announced, a further increase in the speed of releasing genome assemblies is expected. Thus it becomes increasingly important to get an overview as well as detailed information about available sequenced genomes. The different sequencing and assembly methods have specific characteristics that need to be known to evaluate the various genome assemblies before performing subsequent analyses. RESULTS: diArk has been developed to provide fast and easy access to all sequenced eukaryotic genomes worldwide. Currently, diArk 2.0 contains information about more than 880 species and more than 2350 genome assembly files. Metadata such as sequencing and read-assembly methods, sequencing coverage, GC-content, extended lists of alternatively used scientific names and common species names, and various kinds of statistics are provided. To make the data intuitively approachable, the web interface makes extensive use of modern web techniques. A number of search modules and result views facilitate finding and judging the data of interest. Subscribing to the RSS feed is the easiest way to stay up-to-date with the latest genome data. CONCLUSIONS: diArk 2.0 is the most up-to-date database of sequenced eukaryotic genomes compared to databases like GOLD, NCBI Genome, NHGRI, and ISC. It is different in that only those projects are stored for which genome assembly data or considerable amounts of cDNA data are available. Projects in planning stage or in the process of being sequenced are not included. The user can easily search through the provided data and directly access the genome assembly files of the sequenced genome of interest. diArk 2.0 is available at http://www.diark.org.
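One of the statistics mentioned above, GC-content, is simple to compute from an assembly sequence. The sketch below is a generic calculation, not diArk's implementation; ignoring ambiguous bases such as N is an assumption of this example.

```python
def gc_content(sequence):
    """Fraction of G and C among unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored, which is one common convention
    and an assumption here; an empty or fully ambiguous input yields 0.0.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


print(round(gc_content("ATGCNNgc"), 3))
```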

4.

Predicting mutually exclusive spliced exons based on exon length, splice site and reading frame conservation, and exon sequence homology.

Pillmann, Holger; Hatje, Klas; Odronitz, Florian; Hammesfahr, Björn; Kollmar, Martin.

BMC Bioinformatics ; 12: 270, 2011 Jun 30.

Article in English | MEDLINE | ID: mdl-21718515

ABSTRACT

BACKGROUND: Alternative splicing of pre-mature RNA is an important process eukaryotes utilize to increase their repertoire of different protein products. Several types of different alternative splice forms exist including exon skipping, differential splicing of exons at their 3'- or 5'-end, intron retention, and mutually exclusive splicing. The latter term is used for clusters of internal exons that are spliced in a mutually exclusive manner. RESULTS: We have implemented an extension to the WebScipio software to search for mutually exclusive exons. Here, the search is based on the precondition that mutually exclusive exons encode regions of the same structural part of the protein product. This precondition provides restrictions to the search for candidate exons concerning their length, splice site conservation and reading frame preservation, and overall homology. Mutually exclusive exons that are not homologous and not of about the same length will not be found. Using the new algorithm, mutually exclusive exons in several example genes, a dynein heavy chain, a muscle myosin heavy chain, and Dscam were correctly identified. In addition, the algorithm was applied to the whole Drosophila melanogaster X chromosome and the results were compared to the Flybase annotation and an ab initio prediction. Clusters of mutually exclusive exons might be subsequent to each other and might encode dozens of exons. CONCLUSIONS: This is the first implementation of an automatic search for mutually exclusive exons in eukaryotes. Exons are predicted and reconstructed in the same run providing the complete gene structure for the protein query of interest. WebScipio offers high quality gene structure figures with the clusters of mutually exclusive exons colour-coded, and several analysis tools for further manual inspection. The genome scale analysis of all genes of the Drosophila melanogaster X chromosome showed that WebScipio is able to find all but two of the 28 annotated mutually exclusive spliced exons and predicts 39 new candidate exons. Thus, WebScipio should be able to identify mutually exclusive spliced exons in any query sequence from any species with a very high probability. WebScipio is freely available to academics at http://www.webscipio.org.
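The search restrictions named in the abstract (similar length, preserved reading frame, splice sites, homology) can be sketched as a simple candidate filter. This is not the WebScipio algorithm: the thresholds, the naive identity measure and the reduction of "splice site conservation" to canonical GT/AG dinucleotides are all assumptions made for illustration.

```python
def identity(a, b):
    """Naive ungapped identity over the shorter of two peptide strings."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def is_candidate(exon_len, exon_peptide, cand_len, cand_peptide,
                 cand_donor, cand_acceptor,
                 max_len_diff=20, min_identity=0.15):
    """Decide whether a region could be mutually exclusive with a known exon.

    Lengths are in nucleotides. max_len_diff and min_identity are
    hypothetical thresholds, not values taken from the publication.
    """
    # The reading frame survives a swap only if lengths differ by a multiple of 3.
    same_frame = (exon_len - cand_len) % 3 == 0
    similar_length = abs(exon_len - cand_len) <= max_len_diff
    # Simplification: require canonical GT donor and AG acceptor sites.
    canonical_sites = cand_donor == "GT" and cand_acceptor == "AG"
    homologous = identity(exon_peptide, cand_peptide) >= min_identity
    return same_frame and similar_length and canonical_sites and homologous


print(is_candidate(120, "MKTLLV", 123, "MKTALV", "GT", "AG"))
print(is_candidate(120, "MKTLLV", 122, "MKTALV", "GT", "AG"))
```

The second call fails only the frame check: a length difference of two nucleotides would shift the reading frame of every downstream exon.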

Subjects

Algorithms , Alternative Splicing , Drosophila Proteins/genetics , Drosophila melanogaster/genetics , RNA Splice Sites , Amino Acid Sequence , Animals , Base Sequence , Drosophila Proteins/chemistry , Exons , Introns , Molecular Sequence Data , Reading Frames , X Chromosome

5.

Reconstructing the phylogeny of 21 completely sequenced arthropod species based on their motor proteins.

Odronitz, Florian; Becker, Sebastian; Kollmar, Martin.

BMC Genomics ; 10: 173, 2009 Apr 21.

Article in English | MEDLINE | ID: mdl-19383156

ABSTRACT

BACKGROUND: Motor proteins have extensively been studied in the past and consist of large superfamilies. They are involved in diverse processes like cell division, cellular transport, neuronal transport processes, or muscle contraction, to name a few. Vertebrates contain up to 60 myosins and about the same number of kinesins that are spread over more than a dozen distinct classes. RESULTS: Here, we present the comparative genomic analysis of the motor protein repertoire of 21 completely sequenced arthropod species using the owl limpet Lottia gigantea as outgroup. Arthropods contain up to 17 myosins grouped into 13 classes. The myosins are in almost all cases clear paralogs, and thus the evolution of the arthropod myosin inventory is mainly determined by gene losses. Arthropod species contain up to 29 kinesins spread over 13 classes. In contrast to the myosins, the evolution of the arthropod kinesin inventory is not only determined by gene losses but also by many subtaxon-specific and species-specific gene duplications. All arthropods contain each of the subunits of the cytoplasmic dynein/dynactin complex. Except for the dynein light chains and the p150 dynactin subunit they contain single gene copies of the other subunits. Especially the roadblock light chain repertoire is very species-specific. CONCLUSION: All 21 completely sequenced arthropods, including the twelve sequenced Drosophila species, contain a species-specific set of motor proteins. The phylogenetic analysis of all genes as well as the protein repertoire placed Daphnia pulex closest to the root of the Arthropoda. The louse Pediculus humanus corporis is the closest relative to Daphnia followed by the group of the honeybee Apis mellifera and the jewel wasp Nasonia vitripennis. After this group the rust-red flour beetle Tribolium castaneum and the silkworm Bombyx mori diverged very closely from the lineage leading to the Drosophila species.

Subjects

Arthropods/classification , Kinesins/genetics , Myosins/genetics , Animals , Arthropods/genetics , Comparative Genomic Hybridization , Drosophila/genetics , Dynactin Complex , Dyneins/genetics , Dyneins/metabolism , Kinesins/chemistry , Microtubule-Associated Proteins/genetics , Microtubule-Associated Proteins/metabolism , Phylogeny

6.

WebScipio: an online tool for the determination of gene structures using protein sequences.

Odronitz, Florian; Pillmann, Holger; Keller, Oliver; Waack, Stephan; Kollmar, Martin.

BMC Genomics ; 9: 422, 2008 Sep 18.

Article in English | MEDLINE | ID: mdl-18801164

ABSTRACT

BACKGROUND: Obtaining the gene structure for a given protein encoding gene is an important step in many analyses. A software suited for this task should be readily accessible, accurate, easy to handle and should provide the user with a coherent representation of the most probable gene structure. It should be rigorous enough to optimise features on the level of single bases and at the same time flexible enough to allow for cross-species searches. RESULTS: WebScipio, a web interface to the Scipio software, allows a user to obtain the corresponding coding sequence structure of a given query protein sequence that belongs to an already assembled eukaryotic genome. The resulting gene structure is presented in various human readable formats like a schematic representation, and a detailed alignment of the query and the target sequence highlighting any discrepancies. WebScipio can also be used to identify and characterise the gene structures of homologs in related organisms. In addition, it offers a web service for integration with other programs. CONCLUSION: WebScipio is a tool that allows users to get a high-quality gene structure prediction from a protein query. It offers more than 250 eukaryotic genomes that can be searched and produces predictions that are close to what can be achieved by manual annotation, for in-species and cross-species searches alike. WebScipio is freely accessible at http://www.webscipio.org.

Subjects

Protein Sequence Analysis/methods , Software , User-Computer Interface , Algorithms , Amino Acid Sequence , Animals , Genetic Databases , Genomics , Humans , Sequence Alignment , DNA Sequence Analysis/methods , Species Specificity

7.

Scipio: using protein sequences to determine the precise exon/intron structures of genes and their orthologs in closely related species.

Keller, Oliver; Odronitz, Florian; Stanke, Mario; Kollmar, Martin; Waack, Stephan.

BMC Bioinformatics ; 9: 278, 2008 Jun 13.

Article in English | MEDLINE | ID: mdl-18554390

ABSTRACT

BACKGROUND: For many types of analyses, data about gene structure and locations of non-coding regions of genes are required. Although a vast amount of genomic sequence data is available, precise annotation of genes is lagging behind. Finding the corresponding gene of a given protein sequence by means of conventional tools is error prone, and cannot be completed without manual inspection, which is time consuming and requires considerable experience. RESULTS: Scipio is a tool based on the alignment program BLAT to determine the precise gene structure given a protein sequence and a genome sequence. It identifies intron-exon borders and splice sites and is able to cope with sequencing errors and genes spanning several contigs in genomes that have not yet been assembled to supercontigs or chromosomes. Instead of producing a set of hits with varying confidence, Scipio gives the user a coherent summary of locations on the genome that code for the query protein. The output contains information about discrepancies that may result from sequencing errors. Scipio has also successfully been used to find homologous genes in closely related species. Scipio was tested with 979 protein queries against 16 arthropod genomes (intra species search). For cross-species annotation, Scipio was used to annotate 40 genes from Homo sapiens in the primates Pongo pygmaeus abelii and Callithrix jacchus. The prediction quality of Scipio was tested in a comparative study against that of BLAT and the well established program Exonerate. CONCLUSION: Scipio is able to precisely map a protein query onto a genome. Even in cases when there are many sequencing errors, or when incomplete genome assemblies lead to hits that stretch across multiple target sequences, it very often provides the user with the correct determination of intron-exon borders and splice sites, showing an improved prediction accuracy compared to BLAT and Exonerate. Apart from being able to find genes in the genome that encode the query protein, Scipio can also be used to annotate genes in closely related species.
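To make the notions of intron-exon borders and splice sites concrete, the sketch below reads the donor and acceptor dinucleotides of each intron from a set of exon coordinates and joins the exons into a coding sequence. It is not Scipio's code: the coordinate convention (0-based, half-open, forward strand) and the toy sequence are assumptions.

```python
def intron_splice_sites(genome, exons):
    """Return (donor, acceptor) dinucleotides for every intron.

    genome: forward-strand DNA string.
    exons:  sorted list of (start, end) pairs, 0-based and half-open
            (a convention assumed for this sketch).
    """
    sites = []
    for (_, end), (next_start, _) in zip(exons, exons[1:]):
        intron = genome[end:next_start]
        sites.append((intron[:2], intron[-2:]))
    return sites


def spliced_cds(genome, exons):
    """Concatenate the exon sequences in order."""
    return "".join(genome[start:end] for start, end in exons)


# Toy gene: two exons separated by one canonical GT...AG intron.
toy_genome = "ATGGCC" + "GTAAGTTTTCAG" + "GGATAA"
toy_exons = [(0, 6), (18, 24)]
print(intron_splice_sites(toy_genome, toy_exons))
print(spliced_cds(toy_genome, toy_exons))
```

A real mapping also has to handle the reverse strand, codons split across an intron, and exons located on different contigs, none of which is covered here.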

Subjects

Algorithms , Exons/genetics , Introns/genetics , Proteins/genetics , DNA Sequence Analysis/methods , Software , Base Sequence , Molecular Sequence Data , Sequence Alignment/methods , Nucleic Acid Sequence Homology , Species Specificity

8.

Comparative genomic analysis of the arthropod muscle myosin heavy chain genes allows ancestral gene reconstruction and reveals a new type of 'partially' processed pseudogene.

Odronitz, Florian; Kollmar, Martin.

BMC Mol Biol ; 9: 21, 2008 Feb 06.

Article in English | MEDLINE | ID: mdl-18254963

ABSTRACT

BACKGROUND: Alternative splicing of mutually exclusive exons is an important mechanism for increasing protein diversity in eukaryotes. The insect Mhc (myosin heavy chain) gene produces all different muscle myosins as a result of alternative splicing, in contrast to most other organisms of the Metazoa lineage, which have a family of muscle genes with each gene coding for a protein specialized for a functional niche. RESULTS: The muscle myosin heavy chain genes of 22 species of the Arthropoda ranging from the waterflea to wasp and Drosophila have been annotated. The analysis of the gene structures allowed the reconstruction of an ancient muscle myosin heavy chain gene and showed that during evolution of the arthropods introns have mainly been lost in these genes although intron gain might have happened in a few cases. Surprisingly, the genome of Aedes aegypti contains another and that of Culex pipiens quinquefasciatus two further muscle myosin heavy chain genes, called Mhc3 and Mhc4, that contain only one variant of the corresponding alternative exons of the Mhc1 gene. Mhc3 transcription in Aedes aegypti is documented by EST data. Mhc3 and Mhc4 inserted in the Aedes and Culex genomes either by gene duplication followed by the loss of all but one variant of the alternative exons, or by incorporation of a transcript of which all other variants have been spliced out retaining the exon-intron structure. The second and more likely possibility represents a new type of a 'partially' processed pseudogene. CONCLUSION: Based on the comparative genomic analysis of the alternatively spliced arthropod muscle myosin heavy chain genes we propose that the splicing process operates sequentially on the transcript. The process consists of the splicing of the mutually exclusive exons until one exon out of the cluster remains while retaining surrounding intronic sequence. In a second step splicing of introns takes place. A related mechanism could be responsible for the splicing of other genes containing mutually exclusive exons.

Subjects

Alternative Splicing/genetics , Arthropods/genetics , Molecular Models , Muscles/metabolism , Myosin Heavy Chains/genetics , Phylogeny , Pseudogenes/genetics , Animals , Base Sequence , Gene Components , Genomics/methods , Genetic Models , Molecular Sequence Data , DNA Sequence Analysis , Sequence Homology , Species Specificity

9.

Drawing the tree of eukaryotic life based on the analysis of 2,269 manually annotated myosins from 328 species.

Odronitz, Florian; Kollmar, Martin.

Genome Biol ; 8(9): R196, 2007.

Article in English | MEDLINE | ID: mdl-17877792

ABSTRACT

BACKGROUND: The evolutionary history of organisms is expressed in phylogenetic trees. The most widely used phylogenetic trees describing the evolution of all organisms have been constructed based on single-gene phylogenies that, however, often produce conflicting results. Incongruence between phylogenetic trees can result from the violation of the orthology assumption and stochastic and systematic errors. RESULTS: Here, we have reconstructed the tree of eukaryotic life based on the analysis of 2,269 myosin motor domains from 328 organisms. All sequences were manually annotated and verified, and were grouped into 35 myosin classes, of which 16 have not been proposed previously. The resultant phylogenetic tree confirms some accepted relationships of major taxa and resolves disputed and preliminary classifications. We place the Viridiplantae after the separation of Euglenozoa, Alveolata, and Stramenopiles, we suggest a monophyletic origin of Entamoebidae, Acanthamoebidae, and Dictyosteliida, and provide evidence for the asynchronous evolution of the Mammalia and Fungi. CONCLUSION: Our analysis of the myosins allowed combining phylogenetic information derived from class-specific trees with the information of myosin class evolution and distribution. This approach is expected to result in superior accuracy compared to single-gene or phylogenomic analyses because the orthology problem is resolved and a strong determinant not depending on any technical uncertainties is incorporated, the class distribution. Combining our analysis of the myosins with high quality analyses of other protein families, for example, that of the kinesins, could help in resolving still questionable dependencies at the origin of eukaryotic life.

Subjects

Myosins/chemistry , Myosins/physiology , Alleles , Animals , Biological Evolution , Chromosome Mapping/methods , Protein Databases , Molecular Evolution , Genetic Variation , Life , Biological Models , Myosins/genetics , Phylogeny , Protein Tertiary Structure , Species Specificity , Stochastic Processes

10.

diArk--a resource for eukaryotic genome research.

Odronitz, Florian; Hellkamp, Marcel; Kollmar, Martin.

BMC Genomics ; 8: 103, 2007 Apr 17.

Article in English | MEDLINE | ID: mdl-17439643

ABSTRACT

BACKGROUND: The number of completed eukaryotic genome sequences and cDNA projects has increased exponentially in the past few years although most of them have not been published yet. In addition, many microarray analyses yielded thousands of sequenced EST and cDNA clones. For the researcher interested in single gene analyses (from a phylogenetic, a structural biology or other perspective) it is therefore important to have up-to-date knowledge about the various resources providing primary data. DESCRIPTION: The database is built around 3 central tables: species, sequencing projects and publications. The species table contains commonly and alternatively used scientific names, common names and the complete taxonomic information. For projects the sequence type and links to species project web-sites and species homepages are stored. All publications are linked to projects. The web-interface provides comprehensive search modules with detailed options and three different views of the selected data. We have especially focused on developing an elaborate taxonomic tree search tool that allows the user to instantaneously identify e.g. the closest relative to the organism of interest. CONCLUSION: We have developed a database, called diArk, to store, organize, and present the most relevant information about completed genome projects and EST/cDNA data from eukaryotes. Currently, diArk provides information about 415 eukaryotes, 823 sequencing projects, and 248 publications.
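The three central tables described above (species, sequencing projects, publications) can be sketched as a small relational schema. This is an illustration using Python's built-in sqlite3 module, not diArk's actual schema or database engine; all column names and the placeholder rows are assumptions.

```python
import sqlite3

# Hypothetical column names; only the three-table layout comes from the abstract.
SCHEMA = """
CREATE TABLE species (
    id INTEGER PRIMARY KEY,
    scientific_name TEXT NOT NULL,
    common_name TEXT,
    taxonomy TEXT
);
CREATE TABLE projects (
    id INTEGER PRIMARY KEY,
    species_id INTEGER NOT NULL REFERENCES species(id),
    sequence_type TEXT,
    website TEXT
);
CREATE TABLE publications (
    id INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL REFERENCES projects(id),
    citation TEXT NOT NULL
);
"""


def build_demo():
    """Create an in-memory database with one placeholder row per table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO species VALUES (1, 'Drosophila melanogaster', "
        "'fruit fly', 'Eukaryota; Metazoa; Arthropoda')"
    )
    conn.execute("INSERT INTO projects VALUES (1, 1, 'genome', NULL)")
    conn.execute("INSERT INTO publications VALUES (1, 1, 'placeholder citation')")
    return conn


def publications_for_species(conn, scientific_name):
    """Follow species -> projects -> publications for one species."""
    rows = conn.execute(
        "SELECT pub.citation FROM publications pub "
        "JOIN projects p ON pub.project_id = p.id "
        "JOIN species s ON p.species_id = s.id "
        "WHERE s.scientific_name = ?",
        (scientific_name,),
    )
    return [row[0] for row in rows]


demo = build_demo()
print(publications_for_species(demo, "Drosophila melanogaster"))
```

Linking publications to projects rather than directly to species mirrors the statement that all publications are linked to projects; a species with several sequencing projects then collects the publications of each.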

Subjects

Genetic Databases , Eukaryotic Cells , Genome , Complementary DNA/genetics , Expressed Sequence Tags , Internet

11.

Pfarao: a web application for protein family analysis customized for cytoskeletal and motor proteins (CyMoBase).

Odronitz, Florian; Kollmar, Martin.

BMC Genomics ; 7: 300, 2006 Nov 29.

Article in English | MEDLINE | ID: mdl-17134497

ABSTRACT

BACKGROUND: Annotation of protein sequences of eukaryotic organisms is crucial for the understanding of their function in the cell. Manual annotation is still by far the most accurate way to correctly predict genes. The classification of protein sequences, their phylogenetic relation and the assignment of function involves information from various sources. This often leads to a collection of heterogeneous data, which is hard to track. Cytoskeletal and motor proteins consist of large and diverse superfamilies comprising up to several dozen members per organism. To date there is no integrated tool available to assist in the manual large-scale comparative genomic analysis of protein families. DESCRIPTION: Pfarao (Protein Family Application for Retrieval, Analysis and Organisation) is a database driven online working environment for the analysis of manually annotated protein sequences and their relationship. Currently, the system can store and interrelate a wide range of information about protein sequences, species, phylogenetic relations and sequencing projects as well as links to literature and domain predictions. Sequences can be imported from multiple sequence alignments that are generated during the annotation process. A web interface allows users to conveniently browse the database and to compile tabular and graphical summaries of its content. CONCLUSION: We implemented a protein sequence-centric web application to store, organize, interrelate, and present heterogeneous data that is generated in manual genome annotation and comparative genomics. The application has been developed for the analysis of cytoskeletal and motor proteins (CyMoBase) but can easily be adapted for any protein.

Subjects

Cytoskeletal Proteins/chemistry , Protein Databases , Internet , Molecular Motor Proteins/chemistry , Proteomics/methods , Software , Amino Acid Sequence , Computational Biology/methods , Cytoskeletal Proteins/genetics , Molecular Motor Proteins/genetics , Programming Languages , User-Computer Interface

12.

DNA replication in protein extracts from human cells requires ORC and Mcm proteins.

Baltin, Jens; Leist, Sandra; Odronitz, Florian; Wollscheid, Hans-Peter; Baack, Martina; Kapitza, Thomas; Schaarschmidt, Daniel; Knippers, Rolf.

J Biol Chem ; 281(18): 12428-35, 2006 May 05.

Article in English | MEDLINE | ID: mdl-16537544

ABSTRACT

We used protein extracts from proliferating human HeLa cells to support plasmid DNA replication in vitro. An extract with soluble nuclear proteins contains the major replicative chain elongation functions, whereas a high salt extract from isolated nuclei contains the proteins for initiation. Among the initiator proteins active in vitro are the origin recognition complex (ORC) and Mcm proteins. Recombinant Orc1 protein stimulates in vitro replication presumably in place of endogenous Orc1 that is known to be present in suboptimal amounts in HeLa cell nuclei. Partially purified endogenous ORC, but not recombinant ORC, is able to rescue immunodepleted nuclear extracts. Plasmid replication in the in vitro replication system is slow and of limited efficiency but robust enough to serve as a basis to investigate the formation of functional pre-replication complexes under biochemically defined conditions.

Subjects

DNA Replication , Minichromosome Maintenance 1 Protein/metabolism , Origin Recognition Complex , Animals , Cell Nucleus/metabolism , Cell-Free System , DNA-Binding Proteins/chemistry , HeLa Cells , Humans , Insects , Nuclear Proteins/chemistry , Phosphorylation , Plasmids/metabolism , Recombinant Proteins/chemistry
